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Q-f Abstract 

Quantile regression predicts the r-quantile of the conditional distribution of a response 
variable given the explanatory variable for r e (0, 1). The aim of this paper is to establish 
the asymptotic distribution of the quantile estimator obtained by penalized spline method. 

■ A simulation and an exploration of real data are performed to validate our results. 
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1 Introduction 

^> I Regression analysis is one of the most important tools used to investigate the relationship 

I between a response Y and a predictor X. Many major studies of regression have been concerned 

with the estimation of the conditional mean function of Y given a predictor X = x. On the other 
_ hand, the estimation of the conditional quantile function of Y given x has gained momentum in 

\ recent years. This analysis is called quantile regression. In quantile regression, the purpose is 

" to estimate an unknown function rjT-{x) that satisfies 

-T P{Y < 7]r{x)\X = X) = T 

X 

■ for a given r G (0, 1). When r = 0.5, r]T-{x) is the conditional median of Y. One established 

advantage of quantile regression as compared to mean regression is that the estimators are 
more robust against outliers in the response measurements. Quantile regression models have 
been suggested by Koenker and Bassett (1978). Many authors have studied quantile regression 
based on the parametric method, its asymptotic theories, the computational aspects and other 
properties, and these developments have been summarized by Koenker (2005) and Hao and 
Naiman (2007). The nonparametric methods for quantile regression have also been studied 
extensively. Many authors have explored the topic in relation to kernel methods, including Fan 
et al. (1994), Yu and Jones (1998), Takeuchi et al. (2006), Kai et al. (2011). On the other 
hand, Hendricks and Koenker (1992) and Koenker et al. (1994) used the low-rank regression 
splines method and the smoothing splines method, respectively. Pratesi et al. (2009) and 
Reiss and Huang (2012) utilized the penalized spline smoothing method. This paper focuses 
on penalized splines. Compared with unpenalized splines and smoothing splines, an advantage 
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of the penalized spline methods is follows. Although the smoothing spline estimator gives the 
predictor with fitness and smoothness, the computational cost to construct the estimator is high. 
In unpenalized regression spline methods, on the other hand, it is known that the estimator 
tends to have a wiggle curve, but the computational cost is lower than that of smoothing spline 
methods. The penalized spline estimator, however, gives the curve with fitness and smoothness 
and its computational cost is lower than that of smoothing spline methods. Thus, penalized 
splines can be considered an efficient technique. 

Previous results of asymptotic studies of nonparametric quantile regressions include the 
following. Fan et al. (1994) showed the asymptotic normality of the kernel estimator. Yu and 
Jones (1998) proposed a new kernel estimator and studied its asymptotic results. He and Shi 
(1994) showed the convergence rate of the unpenalized regression spline estimator. Portnoy 
(1997) discussed asymptotics for smoothing spline estimators. However, the asymptotics for the 
penalized spline estimator of quantile regression have not yet been studied. 

In this paper, we show the asymptotic distribution of the penalized spline estimator for 
quantile regression with a low-rank S-spline model and the difference penalty. The penalized 
spline estimator of ^(x) for a given r is defined as the minimizer of the convex loss function, 
which is the check function pr with an additional difference penalty. To establish the asymptotic 
distribution of the penalized spline estimator, we need to derive two biases (i) the model bias 
between the true function r]T-{x) and the S-spline model, and (ii) the bias arising from using 
the penalty term. By showing the asymptotic form of these two biases, the resulting asymptotic 
bias of the penalized spline estimator can be obtained. Finally, together with the asymptotic 
variance of the estimator, we show the asymptotic normality of the penalized spline quantile 
estimator. 

This paper is organized as follows. In Section 2, we define the penalized spline quantile 
estimator for a given r. In terms of our estimation method, we mainly focus on the penalized 
iteratively reweighted least squares method. Section 3 provides the asymptotic bias and variance 
as well as the asymptotic distribution of the penalized spline quantile estimator. Furthermore, 
the related properties are described. In Section 4, numerical studies are conducted. Related 
discussion and issues for future research are provided in Section 5. Finally, proofs for the 
theoretical results are all given in the Appendix. 

2 Penalized spline estimator in quantile regression 

For a given dataset {(y^, x^) : i = 1, • • • , n}, consider the conditional 100r% quantile of response 
Yi given Xi = Xi as 

P{Yi < 7]riXi)\Xi = Xi) = T, 

where r G (0, 1) and rjrixi) is an unknown true conditional quantile function of Yi given Xi = Xi. 
It is easy to show that the true function satisfies 

?7t-(x) = argmin£^[pT-(F — a{x))\X = x]. 

a 



2 



Here, pr is the check function provided by Koenker and Bassett (1978), given as 

Priu) = u{t — I{u < 0)), 

where I{u < b) is the indicator function of (—00,6). We want to estimate ijrix) using penahzed 
sphne methods. To approximate r]r{x), we consider the i?-sphne model 

K 

k=-p+l 

where B^\x){k = —p+1, • • • , K) are the pth. degree i?-spline basis functions defined recursively 
as 

^ 1 0, otherwise, 

BPix) = "-"^-^ i3r](.) + ^^±^<f(.), 

I^k+p—l l^k+p l^k 

where Kk{k = —p + 1, • • • ,K + p) are knots and h}^{T){k = —p + 1, • • • , K) are unknown pa- 
rameters. We denote B^^\x) as Bf^{x) unless the degrees of i?-sphnes are specified. Details 
and many properties of the B-spline function are clarified by de Boor (2001). The estimator of 
6(r) = (6_p+i(r) • • • 6/^„(r))^ is defined as 

6(r) = (Lp+i(T) ••• bK^ryf 

= argmin | p.(y, - B{x,fb{T)) + ^birfDlD^bir) I , (1) 

b{r) U=l J 

where B{xj) = {B^p^i{xj) ■ ■ ■ BK„{xj))'^ , Ar(> 0) is the smoothing parameter and {Kn + p — 
m) X (i^„+p)th matrix Dm is the mth difference matrix, which is defined as Dm = {d^ij^)ij, where 
dj™"^ = (— l)l*~-'lmC|j_j| for i < J < m + 1, and for otherwise. It is well known that the differ- 
ence penalty in ([1]) is very useful in mean regression and can be regarded as the controller of the 
smoothness of Sr{x) because we can interpret b{T)^ D^Dmb{T) ^ K^"^~^ {s^t^\x)}'^ dx (see, 
Eilers and Marx (1996)). Although Reiss and Huang (2012) used the penalty {s^\x)}'^ dx , 
this penalty contains an integral and hence the computational difficulty for the resulting esti- 
mator grows. Therefore, this paper proposes using b^r)^ DmDmb{T) as the penalty. In fact, 
b(r) is obtained via linear-programming methods, such as simplex methods or interior points 
methods (see Koenker and Park (1996), Koenker (2005)). On the other hand, it is known that 
the iteratively reweighted least squares (IRLS) method is a useful in nonparametric quantile 
regression. The penalized spline estimator obtained via IRLS was also studied and detailed by 
Reiss and Huang (2012). 

Since IRLS is important for obtaining the estimator, we now provide the complete algorithm. 

- (fc) 

For a given A,-, the A:-steps iterated estimator b (r) is defined as follows: 
b^'\r) = (Z^W^'-'^Z + XrDlDmr'z'^W^'^-'^y, 
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where y = {yi ■■■ Vnf , Z = {B^j+p{xi))ij, W'^''^ = diag[it;f ^ • • • w'n'^] and 



(fc) 



r-I{yi-B{xiYil'^ ^\t)<^) ^Tti^-^) r ^^^ ^ 
T \yi-B{xi) h (T))|>a, 

2(y,-S(x,rb^ \t)) 

" (fe— 1) 

T{yi - Bjxjfb (t)) ^T^C^-i)/ 

0<yi-S(xj)^& (rj) < a, 



a 



a 



for small a > and the initial l^^*^). As A; — )• cxd, it can be shown that limk^oob (r) is 
approximately equivalent to the minimizer of ([1]). By using b(r), the penalized spline estimator 
of r]r{x) is defined as 



Ux)= Bf{x)k{T) = B{x)n{r). 

k=-p+l 

3 Asymptotic theory 

In this section, we show the asymptotic property of fj-rix). Then, we assume that the number of 
knots K and smoothing parameter A,- are dependent on n, and we write Kn and \r,n, respec- 
tively. For simplicity, we write = XT,n- We give some assumptions regarding the asymptotics 
of the penalized spline quantile estimator. 

Assumptions 

1. The explanatory X is distributed as Q(x) on [0, 1]. 

2. The knots for the i?-spline basis are equidistantly located as Kk = k/Kn{k = —p + 
1, • • • , Kn + p) and the number of knots satisfies Kn = o(n^/^). 

3. There exists 7 > such that E[\{t - I{Y < r/^(x)))p+'^|X = z] < 00. 

4. The order of the difference matrix is m < p + 1. 

5. The smoothing parameter A„ is a positive sequence such that A~^ is larger than the 
maximum eig envalue of G{t)-^/^ Dj^^DmGir)-^/^ . 

To describe the asymptotic form of fir{x), we introduce the following symbols and notations. 
Define the {Kn +p)th square matrix G = {Gij)ij by 

Gij = / B^p+i{u)B^p+j{u)dQ{u) 
Jo 

and the {K^ + p)th square matrix G{t) as having the (z, j)-component 

Gijir) = / f{r]r{u)\u)B_p+i{u)B^p+j{u)dQ{u), 
Jo 
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where f{y\x) is the conditional density function of Y given X = x. 

Let &*(t) be a best Loo approximation to the true function i]r{x), which satisfies 

sup \r]rix) + 6^(x) - S(x)'6*(t)| = o{K-^p+^^), 

xG(0,l) 



where 



I{a < X < b) is the indicator function of an interval (a, 6) and Brp(x) is the pth Bernoulli 
polynomial(see Zhou et al. (1998)). Next, we use r]*{x) = B{xY'b* {t). 
The penalized spline quantile estimator can be decomposed as 

firix) - rir{x) = fir{x) - r]*{x) + r/*(x) - rir{x) = f]r{x) - r]*{x) + h",{x) + o(K^^^+-^^). 

We investigate the asymptotic distribution of fiT-{x) — r]*{x) in the following Proposition. 

Proposition 1. Letrir{-) G C'p^'^ . Furthermore suppose = ©(n^/^^^^^^^) and Xn = 0{n'^),u < 
{p + m + l)/(2p + 3). Then under the Assumptions, for x G (0, 1), as n ^ oo, 




— {r}.(x) - r,:{x) - b^ix)} ^ iV(0, ^r{x)), 



where 



6^(x) = -^S(x)^(G(r) + (A„/n)Z?^D„)-iZ)^Z)„&*(r) = 0(n-(P+i)/(2p+3))^ 
n 

^r{x) = lim lil^S(x)'^(G(r) + (A„/n)Z)^I)„)-iG(G(r) + {\n/n)DlD^)-^ B{x). 

The following Theorem, which is the main result in this paper, can be obtained straightfor- 
wardly from Proposition [TJ 

Theorem 1. Under the same assumptions as PropositionUl for x € (0, 1), as n ^ oo, 

^{U^) - r]r{x) - 6«(x) - b^ix)} A iV(0, ^rix)), 
where b^{x) and ^r{x) are those given in PropositionUl 

Remark 1 Under the same assumption as Theorem [H the rate of convergence of the mean 
squared error (MSE) of r/T-(x) becomes 

E [{flr{x) - r?,(x)}2] = 0(^-(2p+2)/(2p+3))_ 

This rate is the same as that of the penalized spline estimator in mean regression (see, Kauer- 
mann et al. (2009)). 
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Remark 2 For the unpenalized regression spline quantile estimator, its asymptotic normality 
is obtained through Theorem [T] with An = 0. 

Remark 3 When the true quantile function has a polynomial form ijr^x) = oq + aix + • • • + 
aqx'^{q < p), r]r{x) = r]*{x) is satisfied since the qth polynomial model can be expressed as the 
linear combination of the pth S-spline bases {B^^ : k = —p + 1, • • • , Kn}{see de Boor (2001)). 
Therefore, in this case, the model bias becomes 0, indicating that the regression spline quantile 
estimator is unbiased. We can definitely show that i?[^T-(C/j)|X„] = in the proof of Lemma [2l 

Remark 4 Let ei{i = 1, • • • ,n) be independently and identically distributed as the density 
/e(e) and assume that Xi and Ei are independent. Consider the data {{yi,Xi) : i = !,••• ,n} 
with Yi = r]{xi) + ej. Then the conditional 100r% quantile of Yi given Xi = Xi can be written as 
r]r{xi) = r]{xi)+F^^{T), where F~'^{t) is the 100t% quantile of Sj. For any r G (0, 1), 
7^{p+i)(a;), with which 6"(x) is unchanged by r. Next, we obtain G(r) = f^{F~^{T))G since 
f{rjr{x)\x) is equal to /e(Fg^^(r)). Furthermore, b*{T) can be written as &*(t) = 6* + F^^{t)1, 
where b* is the best L^q approximation of r/(x) defined in the same manner as b*{T) and 1 is a 
{Kn +p) vector with all components equal to 1. Since all components of -Dml are vanishing, for 
r G (0, 1), we have 

6^(2;) = ^ B{xf (g+ DlD^] ' D^Dmb*. 

The asymptotic variance of fir{x) can be written as 

^r{x) = lim an{T)B{xY (g + l\ DlDm) g(g+ l\ DlDm) B{x), 
where 

r(l-r) 

CtnyTj = 1 . 

{/,(F,-i(r))PK„ 

When the sample size is sufficiently large under the same assumptions as Theorem [T] and 
m < p + 1, the influences of r on b^{x) and ^t{x) appear only as l//e(F~"'^(T)) and r(l — 
r)/{/e(F~^(T))}^, respectively. In general, if the density of Ei is symmetrical at e = 0, the 
asymptotic bias and variance of f]T{x) are small at r = 0.5. Figure [U shows l//e(F~^(T)) and 
r(l — t) / {f!;{F~^ (t))}'^ with normal and Cauchy distributions. 

We observe that b^{x) and ^t{x) are smallest at r = 0.5. For ^r{x) near r = or r = 1, the 
effect of T becomes small. 

Remark 5 Claeskens et al. (2009) studied the asymptotics of penalized spline estimators in 
mean regression, with the estimator 7]{x) = B{x)'^b, where b is the minimizer of 

{y - Zb)'{y - Zb) + !\s^^^ {x)fdx. (3) 

Jo 



6 



Figure 1: Plots for 1/ fe{F-\T)){soM) and r(l - T)/{fe{F-\T))}^ (dashed). The left panel 
shows the standard normal distribution and the right panel shows the Cauchy distribution with 
location and scale 0.01. 

Here, s{x) = B{x)'^b,b G M^"+p and fin is the smoothing parameter. They developed the 
asymptotics for fj{x) under two scenarios: (a) Kg = Kq{n, K^, fJ-n), which as given in their paper 
is less than 1, or (b) Kg > 1. Assumption 5 of this paper is equal to the condition Kg < 1. To- 
gether with the approximation property that Xnb{T)^ D^Dmb{T) finK^^'^ {s^\x)}'^ dx , 
the results of this paper can be regarded as the quantile regression version of Theorem 2 (a) of 
Claeskens et al. (2009). 

Remark 6 To construct the penalized spline estimator of rjr^x), we can also use the truncated 
spline Cr{x) = C{x)'^6{t) as an approximation to r]T-{x), where C{x) = [1 x ••• x^ [x — 
I'^if^- ■ ■ ■ {x — i^Kn-if\-], (^)+ = niaxjx, 0}, and 0{t) G 1^^"+^* is an unknown parameter vector. 
Pratesi et al. (2009) obtained the estimator fir{x) = C{x)'^6{t), where 9{t) is the minimizer of 

n 

PriVi - CriXi)) + flnOirfeeiT), (4) 

i=l 

where fin is the smoothing parameter and G = diag[Op_|_i Ik„-i]- By the equivalence prop- 
erty between the i?-spline model and truncated model, there exists a [Kn + p)th square and 
nonsingular matrix L such that B{x) = LC{x). Therefore Cr{x) can be written as 

Cr{x) = C{xfe = B{xfL~^e{T) = B{xfb{T), 

where &(r) = L^^O(t). Furthermore, the penalty term in (jj]) satisfies from Claeskens et al. 
(2009) 

6{rfee{T) = Kl^birf Dl^,D,+Mr) 

The asymptotic distribution of 77r(a:) = C{x)'^8{t) can be obtained by showing that of B{x)'^b, 
where b is the minimizer of 
n 

J2pr {y^ - B{x,fb{T)) + finKl^b{T)'^Dl^^n,+MT). 
i=l 
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Then, the asymptotic distribution of "fjrix) can be obtained using Theorem [T] under m = p + 1 
and An = finKn^. Thus, we obtain the asymptotic distribution of the penahzed truncated sphne 
quantile estimator. 

Remark 7 Under some weakly condition, the local pth polynomial quantile estimator fjrix) 
has an asymptotic order 

(see Fan et al. (1994) and Ghouch and Genton (2009)) and, hence, it can be said that the 
rate of convergence of the pth S-spline quantile estimator and the local pth polynomial quantile 
estimator are the same. We note the bias of these estimators with p = 1. From Fan et al. 
(1994), the asymptotic bias of the local linear quantile estimator is 

biix) = -^^^^ [ z^K{z)dz, 

where i^(z) is the second order kernel function and /i„ is the bandwidth. If is equal to 
then the difference between 6"(x) and 6^(x) is only 

V/(kj,i < X < Kj)Br2 ~ and [ z^K{z)dz. (5) 

~{ \ Kn J Jr 

It is easy to show that Br2(x) = x^ — x+1/6 < 1/5 for a; G [0, 1], while we have z'^KG{z)dz = 1 
for the Gaussian kernel Kg{z) and j^z^KE{z)dz = 1/5 for the Epanechnikov kernel Ke{z). 
Therefore the bias of the regression spline estimator is smaller than that of the local linear 
estimator in this situation. 

4 Numerical study 
4.1 Simulation 

In this section, we show numerical simulation to confirm the performance as well as the asymp- 
totic normality of the penalized spline quantile estimator claimed in Theorem[TJ The explanatory 
Xi is generated from a uniform distribution on the interval [0, 1]. The response Yi is created by 
Yi = r]{xi) + Ei, where r]{x) = sin(27r3;). The errors e^'s are independently distributed via (i) a 
normal distribution with mean and variance (0.1)^, (ii) an exponential distribution with mean 
2 and (iii) a Cauchy distribution with location and scale 0.01. In this simulation, to obtain 
the penalized spline quantile estimator, we use {p,Tn) = (3,2) and {Kn,\n) is given via the 
generalized approximate cross-validation (GACV) discussed by Yuan (2006). For comparison, 
we construct the unpenalized regression spline quantile estimator with linear spline bases (p = 1) 
and the local linear quantile estimator. The penalized spline estimator, regression spline estima- 
tor, and local linear estimator are denoted as P-cubic, R-linear and L-linear, respectively. The 
number of knots of R-linear and the bandwidth of L-linear are given by GACV. 
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Table 1: Results of MISE for n = 100 and n = 1000. All entries for MISE are 10^ times their 
actual values. 



n = 100 


Normal 


Exponential 


Cauchy 


T 


P-cubic 


R-linear 


L-linear 


P-cubic 


R-linear 


L-linear 


P-cubic 


R-linear 


L-linear 


0.01 


11.89 


20.16 


20.81 


5.21 


11.15 


11.78 


4704.18 


6667.28 


4122.68 


0.1 


3.78 


4.55 


5.03 


6.26 


9.85 


12.50 


206.43 


340.47 


289.05 


0.25 


3.23 


3.27 


3.99 


10.72 


16.68 


13.42 


18.46 


42.15 


86.46 


0.5 


2.87 


3.34 


3.60 


20.26 


31.66 


27.92 


18.88 


35.22 


43.33 


n = 1000 


Normal 


Exponential 


Cauchy 


T 


P-cubic 


R-linear 


L-linear 


P-cubic 


R-linear 


L-linear 


P-cubic 


R-linear 


L-linear 


0.01 


1.23 


1.77 


1.94 


0.22 


0.31 


0.31 


160.52 


910.09 


1178.74 


0.1 


0.46 


1.45 


0.67 


1.08 


1.30 


1.08 


19.53 


24.99 


53.93 


0.25 


0.44 


1.84 


0.52 


2.16 


2.77 


1.91 


2.08 


7.08 


2.17 


0.5 


0.12 


0.34 


0.27 


3.66 


5.60 


3.19 


0.20 


4.81 


1.38 



Let 

^ R 100 

MSE,- = - - vAzj)}'', MISE = 100-1 J2 MSE, ' 

r=l j=l 

where zj = j/J, J = 100 and rjr^rizj) is the estimator for the rth repetition. For r = 0.01, 0.1, 0.25 
and 0.5, we calculate the mean integrated squared error (MISE). We then use sample sizes 
n = 100 and 1000 and the number of repetitions R = 1000. 
Next, from P-cubic, we calculate 

where 

l>^,,(x) = t(1 - T)B{x f{Z^RrZ + XnD^D^)-^Z^Z{Z^RrZ + XnD^Dm)-^B{x), 

Rr = diag[fr{'fjT,r{xi)\xi)] and friy\x) is the conditional kernel density estimate given X = x. 
Then we construct the density estimate of Ur = {Ur,i{x), • • • , Ur,R{x)} at x = 0.5 and compare 
with the density of N{0, 1). To obtain fr{y\x) and Ur, the normal kernel and the bandwidth 
discussed by Sheather and Jones (1991) are utilized. 

Table 1 shows the MISE for r = 0.01,0.1,0.25 and 0.5. For P-cubic with normal error, 
the performance of the quantile estimator is good even if r = 0.01. It is well known that the 
Cauchy distribution is a pathological distribution. However, the MISE of P-cubic with the 
Cauchy distribution is sufficiently small, indicating that the quantile estimator is robust. For 
the boundary r, on the other hand, the MISE of the estimators is worse than that with interior 
r. For the normal and Cauchy models, the median estimator has better behavior than those 
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with T = 0.01, 0.1 and 0.25. On the other hand, for the exponential model, the median estimator 
has a larger MISE than f]r{x) with other values of r. The reason for this is that the density /(e) 
of exponential distribution is monotonically decreasing and its peak is at e = 0, which leads to 
many responses 1^'s being dropped near r]{xi) + F^'^ij) with small r. We note the performance 
of the penalized spline estimator for r > 0.5. When a normal or Cauchy error is used, it appears 
that the MISE of fjT-{x) and that of fii^r{x) become similar since Yi\xi has a symmetrical density 
function at r]{xi). For an exponential error, the closer r is to 0, the smaller the MISE of ffrix) 
will become. Overall, P-cubic has better behavior than R-linear and L-linear. However, for the 
exponential distribution and n = 1000, the MISE of L-linear is slightly smaller than that of 
P-cubic. Additionally, the performance of L-linear is slightly superior to that of R-linear. This 
indicates that the variance of L-linear is less than that of R-linear (see Remark 7) . 

In Figure [21 the density estimate of Ur for r = 0.1 and 0.5 and the density of A^(0, 1) for 
each error are illustrated. In all errors, we can see that the density estimate of Uq,^{x) becomes 
close to A^(0, 1) as n increases. For a normal distribution with n = 1000, the density estimate 
C/o.5 and A^(0, 1) are similar. In both errors, we see that the speed of convergence of [/0.5 is faster 
than that of C/0.1. 

Remark 8 We have confirmed the behavior of the penalized splines with p = 1 (P-linear) and 
the regression splines with p = 3 (R-cubic) though this is not shown in this paper for reasons of 
space. The MISE of P-linear and R-cubic are similar to the P-cubic and R-linear, respectively. 
For spline smoothing, it is generally known that the pair of the 'cubic' spline and the second 
difference penalty are particularly useful in data analysis. Therefore we mainly focused on 
(p, m) = (3, 2) in this simulation. 

4.2 Application 

In this section, we apply the penalized spline quantile estimator to real data. In all examples, 
we use (j),m) = (3,2) and {Kn,Xn) is chosen via GACV. 

Figure [3] showed the penalized spline quantile estimators (r = 0.1, •• • ,0.9) for bone mineral 
density (BMD) data. This data was presented by Hastie et al. (2009). Takeuchi et al. (2006) 
applied the kernel estimator to the same data. Compared with Figure 2 (b) of their paper, the 
penalized splines have a somewhat smooth curve. 

Next, the confidence interval of rjrix) is illustrated. The 100a% confidence interval of rjrix) 
based on the asymptotic result of fir{x) is obtained as 




(6) 



where 6^(x), b^{x) and ^t{x) are the estimators of 6"(x), b^{x) and ^t-{x), while 2i_q,/2 is a 
(1 — a/2)th normal percentile. As the estimator of b^{x), 




XnB{xf{Z^RZ + \nDlDm)-^DlD^b{T) 
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Normal, tau=0.1 



Normal, tau=0-5 




-4 -2 2 4 -4 -2 2 4 



Figure 2: The density estimate of Ur{x) for n = lOO(dot-dashed) and n = lOOO(dashed), and the 
density of N{0, l)(soUd). The left panels are for r = 0.1 and the right panels are for r = 0.5. The 
upper, middle and bottom panels are for normal, exponential and Cauchy errors, respectively. 
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is used. We utilize ^t{x) as given in the previous section. As the pilot estimator of rjr (x) in 
b^{x), we construct the {p + l)th derivative of the penalized spline quantile estimator with the 
{p + 2)th i?-spline model. Thus, we obtain ([6]). 

In Figure m the 95% approximate confidence interval of rjo^^{x) for motor cycle impact data 
is drawn. This dataset, with {(yj,Xj) : i = 1, - ■ ■ , 132} was given by Hardle (1990), where yi is 
the acceleration (g) and Xi is the time (ms). For comparison, the 95% approximate confidence 
interval with uncorrected bias of r]o_^{x) defined by 



is shown. The penalized spline estimator of the median has a curve with fitness and smoothness. 
In the area near x = 20, we see that there is a strong correction of the bias of 170.5 (a^)- 

Finally, we compare the median estimator and the mean estimator for Boston housing data, 
with {{yi,Xi) : i = 1, - ■ ■ ,506}, where yi is the median value of owner-occupied homes in USD 
1000s (given by MEDV) and Xi is the average number of rooms per dwelling (denoted RM). This 
dataset is available from Harrison and Rubinfeld (1979). Figure [S] shows the penalized spline 
quantile estimator of 7?o.5(a;)(solid) and the penalized spline estimator 



of the conditional mean ofY: g{x) = E\Y\X = x] (dashed), where /i„ is the smoothing parameter 
chosen by generalized cross-validation. At around x = 5 and the right-hand side of x = 8, the 
behavior of the median estimator and the mean estimator are different. We see that g{x) is 
affected by extreme points, such as (x,y) = (4.97,50) and {x,y) = (8.78,21.9). On the other 
hand, it appears that the infiuence of extreme values is limited for the median estimator. 

5 Discussion 

This paper have discussed the asymptotic theory of the penalized spline quantile estimator. We 
showed the asymptotic bias and variance as well as the asymptotic normality of the penalized 
spline quantile estimator. The results can be regarded as the quantile regression version of the 
Theorem 2 (a) of Claeskens et al. (2009). 

As the further study, we may consider the asymptotic property of the penalized splines 
with multivariate covariate (xi, • • • ,Xd)- Doskum and Koo (2000) have studied the unpenalized 
spline quantile estimator in additive models, but the asymptotic results were not discussed. The 
additive model has the true quantile function as rjrixi, • • • , Xd) = Yli=i Virixi)- The aim is then 
to estimate r]ir{xi) for each i. Similar to the work of Doskum and Koo, we can construct the 
penalized spline estimator in additive models. In this field, the asymptotic results should be 
determined. 

In relation to the serious problem of the nonparametric quantile regression, a phenomenon 
called the "quantile crossing" occurs (see Koenker (2005)). He (1997) and Takeuchi et al. 




g{x) = B{xY {Z' Z + f,nDi,Dm)-^Z' y 
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Penalized spline quantile estimator 




10 15 20 

Age 



Figure 3: BMD data (n = 485) with f]r{x). The sohd hnes are for r =0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 
0.7, 0.8 and 0.9 from the bottom to top. 



95% approximate confidence interval 




O 10 20 30 40 50 

time(ms) 



Figure 4: Motor cycle impact data (n = 132) with f}o.5(x) (dashed), the 95% approximate 
confidence intervals (solid) and the 95% approximate confidence intervals with uncorrected bias 
(dot-dashed) . 
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Mean and median estimator 



LU 




4 5 6 7 



Figure 5: Boston housing data (n = 506) with the mean(dashed) and median(sohd) estimators. 



(2006) studied the original estimation methods of ijr^x) without quantile crossing. However, 
the asymptotics for their estimators have not yet been developed. The asymptotic study of 
the penalized splines without quantile crossing would be an interesting topic for further study. 
In addition, by using the asymptotic results of the penalized spline estimator r]r{x), it may be 
possible to construct the penalized spline quantile estimator without quantile crossing although 
this is beyond the scope of this paper. 

Appendix 

For a random variable C/„, and y[C/„|X„] denote the conditional expectation and 

variance of Un given {Xi,--- ,X„) = {xi,--- ,Xn), respectively. For the matrix A = {aij)ij, 
ll^lloo = maxjj{|aij|}. For random sequence {an} and if an/bn = Op(l), then it is written 

as , 

as a„ ~ On- 

Lemma 1. Let A = {aij)ij be {Kn + p) matrix and let H{t) = G{t) + {Xn/n)D^Dm- Assume 
that Kn —)• oo as n —)• oo, ||^||oo = Op{K^). Then, under the Assumption, \\AG\\oo = 0{K"~^) 
and\\AH{T)-'\\^ = OiK^-^''). 

Lemma [J can be proven similar to Lemma 1 of Claeskens et al. (2009). Then, Assumption 
5 which guarantees Kq < 1 that given in their paper. 

Lemma 2. Let ipr{u) = r — I{u < 0), Ui = yi — B[xi)'^b*(T). Under the same assumption as 
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Theorem d 




i=l 



where W ~ N{0,t{1 - t)G). 
Proof of Lemma [3 Let 



K 



i=l 



We show the asymptotic distribution of Zn by Lyapunov's theorem. First from the fact that 
P{Y < 7]r{xi)\Xi = Xi) = T, we have 

E[lPrm\Xn] = T-E[I{Y,<B{x^fb*{r))\Xn] 

= T-PiY <B{xifb*{T)\X,=Xi) 

= T-P{Y < T]r{xi) + ba{x„T){l + o{l))\X, = X,) 

= -ba{Xi,T)f{r]riXi)\Xi){l+o{l)) 

= oil). 

Therefore we obtain 



E 



n 





2+7 




E[MmXn]} 







n 

' K \ (2+7)/2^ 

<o[ ( — 

n 



The straightforward calculation yields 



\(2+7)/2 

" j \B{x,f6f^^E\\i,rm?^^ + o(\)\X, 



V[Zn\Xn] = ^y^{B{xi)^5YV[MY^-B[x,fh*[r))\Xr:\ 



i=l 



K. 



i=l 



KnT{l-T)5'^G5{l+Op{l)) 
0{Kn) 



So it follows that 




F[Z„|X„](2+7)/2 ^ 

/ /K \(2+7)/2\ 





2+7 




E[tl:rm\Xn]} 







0(1) 
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since 7 > 0. This leads to 



E\Zn\X 

n D 



iV(0,l) 



V[Zn\X 

n\ 

from Lyapnov's theorem. The expectation of Z„ can be calculated as 



E[Zn\Xn] = -J^^B{x,fdE[i,r{Ui)\X„] 



1=1 




B{Xif6ba{Xi, T)f{r]r{Xi)\Xi){l + Op(l)) 



i=l 



= ^M^n B{u)'^5ha{u,T)f{r^r{u)\u)du{l + Op{l)). 

Jo 

From the proof of Lemma 6.10 of Argwall and Studen (1989), for j = —p + 1, • • • , Kn, we have 



Bj{u)baiu,T)f{r]r{u)\u)duil + oil)) = o{K-^P+^^), 



by which ^J nKno{Ki 
Lemma [2] holds. 



-(P+2)^ 



0(1). Consequently, we have E[Zn\Xn\/V[Zn\Xn\ = op{\) and 

□ 



Lemma 3. Let Win = \l Kn/nB^Xi)^ 5{i = 1, • • • ,n) for 6 G IR^"^^'. Then, under the assump- 
tions, 



1=1 



Proof of Lemma Let 



i=l 



Since 



{/(lii < s) - L{ui < 0)}ds 



Xr 



E[{I{Ui <s)- I{Ui < mXn]ds 



{P {Yi < B{xifb*{T) + s\X, = X,) - P{Y, < B{x,fb*{T)\Xi = x,)} ds 



n Jo I \ n 

rB{x,)TS 



X, - X, - P{Y, < B{x,Y b*{T)\X, = X,) \ dt 



n 



f{B{Xifh*{T)\x,)tdt 



'-^f {B{x,fb*{T)\x,) {B{x,fS}^ 
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Therefore we obtain 



2n 

K, 
2 



i=l 



n xT 



S'G{t)S{1 + op{1)). 
Finally, we show = op(l). For i = 1, - ■ ■ ,n, we have 



{I(ui < s) - I(ui < 0)}ds < \ —BixiY 
V n 




Therefore the variance of Rr, can be evaluated as 



V[Rn\Xn] < 



i=l 



{I{ui <s)- I{ui < 0)}ds 



n i=l,--- ,n 




Since E[Rn\Xn] = 0{Kn), we obtain y^V[Rn\Xn]/E[Rn\Xn] = op(l) and, hence. Lemma [3] 
holds. 



□ 



Proof of Proposition [IJ Let 



Un{S) = Yl 



i=l 



Pt Ui 



n 



B{Xi) 6 - Pr{Ui) 



where Ui = yt — B{xi) b*{T). Then the minimizer dn{T) of Un{5) can be obtained as 



n 



Sn{T) = ^ — {b{T)-b*{T)). 

First we show the convergence point Uo{S) of Un{S) for any S G M-^""''^. For the following 
discussion, we introduce the Knight's idntity(see. Knight (1998)): 



(7) 



Pt{u — v) — Pt{u) = —vipriu) + / {/(n < s) — I{u < 0)}ds, 

Jo 

where ipriu) = t — I{u < 0). By using ([7]), we can write Un{6) as 

Un{S) = Uin{S) + U2n{S) + U^niS) + Uin{S), 
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where 

i=l 

U2n{S) = V / {I{Ui <S)- I{U^ < 0)}ds, 



l\.r) 



UiniS) = Xn\l—b*{TfDlD„,S, 
V n 

where Win = \^ Kn/nB{xi)^ 5. Prom Lemma 1, Uin{5) satisfies 

where W ~ A'^(0, t(1 — t)G). Furthermore Lemma 2 and U^niS) yield 
U2n{S) + UsniS) ~ (G(r) + ^DlD„,^ 6. 

Therefore, we obtain 



UniS) ~ UoiS) = -VKnW' 6 + K\l—h*{TYDlD.^5 + { G{t) + ^D^D^, ] d. 



n 2 \ n 

Because Uq{S) is convex with respect to S and has unique minimizer, the minimizer (5„(t) of 
Un{S) converge to So{t) = argmin^{C/o((5)}. This fact is detailed in Pollard(1991), Knight 
(1998) and Kato (2009). Hence we have 

Since f]T-{x) — r]*{x) = B{x)'^{b{T) — 6*(t)), we obtain for x G (0, 1), as n — )• oo, 




{77,(x) - r?;(x) - b^ix)} ^ N{0, <^r{x)) 

by the definition of W. We can confirm with Lemma[T]that ^t{x) = 0(1)- Finally we show the 
asymptotic order of b^{x). Let B^^^x) = {B^^jjj_^{x) ••• i?^J^(x))^. By the properties of the 
derivative of the i?-spline model, we have s']^\x) = d"^Sr{x)/dx"^ = B^p~'^\x)'^ Dmb{T) . 
Therefore we obtain sIp"™] (x)'^{ir;7Dm6*(r)} = 7]i"'\x){l + o(l)) for m < p. Since the 
asymptotic order of B^P-'^^xf {K^ Rmb* (t)} and that of ||i^:;^D„&*(T)||oo are the same as 
0(1), \\D mb (''")||oo — G(-^n"^) ^'^ satisfied for ui 'f^ p. In addition, similar to the proof of 
Theorem 1 of Kauermann et al. (2009), \\Dp+ib* {t)\\oo = 0{Kn^^^^^) is fulfilled. Together with 
Lemma [H we obtain 

6^(x) = -^B{xf (Cir) + ^DlD„,y' DlD,^b*{T) = ©(A.n-^K^™) = 0(n-(^'+^)/(2*'+3)). 
Thus Proposition 2 has been proven. □ 
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Proof of Theorem d Theorem[T]can be proven directly from Propositions 1. Under the condition 
we have 

y^{r}.(rr) - r^^x) - b'^x)} = yJ^iUx) - vA^) - K{x) + o{K~^^+^^) - 
and vW^&?(^) = 0(xAV^^n ^^^^^) = 0(1). This completes the proof. □ 
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